 sample correlation


Are You Stealing My Model? Sample Correlation for Fingerprinting Deep Neural Networks

Neural Information Processing Systems

An off-the-shelf model offered as a commercial service can be stolen through model stealing attacks, posing a serious threat to the rights of the model owner. Model fingerprinting aims to verify whether a suspect model was stolen from the victim model, and it has been gaining increasing attention. Previous methods rely on transferable adversarial examples as the model fingerprint, which makes them sensitive to adversarial defenses and transfer learning scenarios. To address this issue, we instead consider the pairwise relationship between samples and propose a novel yet simple model stealing detection method based on SAmple Correlation (SAC). Specifically, we present SAC-w, which selects wrongly classified normal samples as model inputs and calculates the mean correlation among their model outputs. To reduce the training time, we further develop SAC-m, which selects CutMix-augmented samples as model inputs, without the need to train surrogate models or generate adversarial examples. Extensive results validate that SAC successfully defends against various model stealing attacks, even those involving adversarial training or transfer learning, and detects stolen models with the best performance in terms of AUC across different datasets and model architectures. The code is available at https://github.com/guanjiyang/SAC.
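The idea of comparing models through the correlation among their outputs can be illustrated with a minimal sketch. It is not the authors' implementation: the use of cosine similarity as the correlation measure, the mean absolute difference as the distance, and the names `correlation_matrix` and `sac_distance` are all assumptions made for illustration. A small distance would suggest that the suspect model relates the probe samples to each other in the same way the victim model does.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two output vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def correlation_matrix(outputs: Sequence[Vector]) -> List[List[float]]:
    """Pairwise similarity among the outputs a model produces for n probe samples."""
    return [[cosine(a, b) for b in outputs] for a in outputs]


def sac_distance(victim_outputs: Sequence[Vector],
                 suspect_outputs: Sequence[Vector]) -> float:
    """Mean absolute difference between two correlation matrices.

    Assumption: both models were queried with the same probe samples,
    in the same order.
    """
    c_v = correlation_matrix(victim_outputs)
    c_s = correlation_matrix(suspect_outputs)
    n = len(c_v)
    if n == 0 or n != len(c_s):
        raise ValueError("both models need outputs for the same non-empty sample set")
    total = sum(abs(c_v[i][j] - c_s[i][j]) for i in range(n) for j in range(n))
    return total / (n * n)
```

In this sketch the distance depends only on how the samples relate to one another, not on the raw output values, so rescaling every output vector leaves it unchanged.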


Sample Correlation for Fingerprinting Deep Face Recognition

arXiv.org Artificial Intelligence

Abstract: Face recognition has witnessed remarkable [...] However, an off-the-shelf face recognition model as a commercial service could be stolen by model stealing attacks, posing great threats to the rights of the model owner. Model fingerprinting, as a model stealing detection method, aims to verify whether a suspect model is stolen from the victim model, gaining more and more attention nowadays. Previous methods always utilize transferable adversarial examples as the model fingerprint, but this method is known to be sensitive to adversarial defense and transfer learning techniques. To address this issue, we consider the pairwise relationship between samples instead and propose a novel yet simple model stealing detection method based on SAmple Correlation (SAC). [...] JC to previous methods.

Keywords: Model Fingerprinting, Deep Face Recognition

1 Introduction: In recent years, remarkable advancements in face recognition have been largely attributable to the development of deep learning techniques [1]. A common practice for model owners is to offer their models to clients through either cloud-based services or client-side software. Generally, training deep neural networks, especially deep face recognition models, is both resource-intensive and financially burdensome, requiring extensive data collection [...]


Linear Polytree Structural Equation Models: Structural Learning and Inverse Correlation Estimation

arXiv.org Machine Learning

Over the past three decades, the problem of learning directed graphical models from data has received an enormous amount of attention, since such models provide a compact and flexible way to represent the joint distribution of the data, especially when the associated graph is a directed acyclic graph (DAG). A directed graph is called a DAG if it does not contain directed cycles. DAG models are popular in practice, with applications in biology, genetics, machine learning and causal inference (Sachs et al., 2005; Zhang et al., 2013; Koller and Friedman, 2009; Spirtes et al., 2000). There exists an extensive literature on learning the graph structure from data under the assumption that the graph is a DAG. For a summary, see the surveys of Drton and Maathuis (2017) and Heinze-Deml et al. (2018). Existing approaches generally fall into two categories: constraint-based methods (Spirtes et al., 2000; Pearl, 2009) and score-based methods (Chickering, 2002). Constraint-based methods use conditional independence tests to determine whether there exists an edge between two nodes and then orient the edges in the graph, such that the resulting graph is compatible with the conditional independencies seen in the data. Score-based methods formulate the structure learning task as optimizing a score function based on the unknown graph and the data. A polytree is a DAG which does not contain any cycles even if the directions of all edges are ignored.
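The polytree definition above translates directly into a check on the undirected skeleton of a graph. The following is a minimal sketch using a union-find structure; the function name `is_polytree` and the edge-list representation are assumptions for illustration, not part of the paper. Following the definition as stated, the sketch does not require the graph to be connected.

```python
from typing import Iterable, List, Tuple


def is_polytree(num_nodes: int, edges: Iterable[Tuple[int, int]]) -> bool:
    """Return True if the directed graph has no cycle once edge directions are ignored.

    Nodes are labelled 0 .. num_nodes - 1 and each edge (u, v) means u -> v.
    A directed cycle always yields a cycle in the skeleton, so an acyclic
    skeleton also guarantees that the graph is a DAG.
    """
    parent: List[int] = list(range(num_nodes))

    def find(x: int) -> int:
        # Find the component representative, halving the path on the way up.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            # u and v are already connected: this edge closes an undirected cycle.
            return False
        parent[root_u] = root_v
    return True
```

For example, the graph with edges 0 -> 1, 1 -> 2 and 0 -> 2 is a DAG but not a polytree, because the three edges form a cycle once their directions are dropped.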


Generalized Label Enhancement with Sample Correlations

arXiv.org Machine Learning

Recently, label distribution learning (LDL) has drawn much attention in machine learning, where an LDL model is learned from labeled instances. Unlike single-label and multi-label annotations, label distributions describe an instance by multiple labels with different intensities and accommodate more general conditions. As most existing machine learning datasets provide only logical labels, label distributions are unavailable in many real-world applications. To handle this problem, we propose two novel label enhancement methods, i.e., Label Enhancement with Sample Correlations (LESC) and generalized Label Enhancement with Sample Correlations (gLESC). More specifically, LESC employs a low-rank representation of samples in the feature space, and gLESC leverages a tensor multi-rank minimization to further investigate sample correlations in both the feature space and the label space. By exploiting the sample correlations, the proposed methods can boost the performance of label enhancement (LE). Extensive experiments on 14 benchmark datasets demonstrate that LESC and gLESC achieve state-of-the-art results compared with previous label enhancement baselines.


Automatic Classifiers as Scientific Instruments: One Step Further Away from Ground-Truth

arXiv.org Machine Learning

Automatic detectors of facial expression, gesture, affect, etc., can serve as scientific instruments to measure many behavioral and social phenomena (e.g., emotion, empathy, stress, engagement), and this has great potential to advance basic science. However, when a detector $d$ is trained to approximate an existing measurement tool (e.g., observation protocol, questionnaire), care must be taken when interpreting measurements collected using $d$, since they are one step further removed from the underlying construct. We examine how the accuracy of $d$, as quantified by the correlation $q$ of $d$'s outputs with the ground-truth construct $U$, impacts the estimated correlation between $U$ (e.g., stress) and some other phenomenon $V$ (e.g., academic performance). In particular: (1) We show that if the true correlation between $U$ and $V$ is $r$, then the expected sample correlation, over all vectors $\mathcal{T}^n$ whose correlation with $U$ is $q$, is $qr$. (2) We derive a formula to compute the probability that the sample correlation (over $n$ subjects) using $d$ is positive, given that the true correlation between $U$ and $V$ is negative (and vice versa). We show that this probability is non-negligible (around $10-15\%$) for values of $n$ and $q$ that have been used in recent affective computing studies. (3) With the goal of reducing the variance of correlations estimated by an automatic detector, we show empirically that training multiple neural networks $d^{(1)},\ldots,d^{(m)}$ using different training configurations (e.g., architectures, hyperparameters) for the same detection task provides only limited `coverage' of $\mathcal{T}^n$.
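Result (1) can be illustrated numerically with a small simulation. This is a sketch under a simplified model that is an assumption here, not the paper's setup: $U$ is standard Gaussian, and both the detector output and $V$ are built as $U$ plus independent Gaussian noise so that their correlations with $U$ are $q$ and $r$ respectively. The function names `pearson` and `simulate_detector_correlation` are hypothetical. Under this model the population correlation between the detector output and $V$ equals $qr$, so the sample correlation should land close to that product for large $n$.

```python
import math
import random
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate_detector_correlation(q: float, r: float, n: int, seed: int = 0) -> float:
    """Sample correlation between detector outputs and V under the assumed Gaussian model.

    q: correlation between the detector output d and the construct U.
    r: correlation between U and the other phenomenon V.
    """
    rng = random.Random(seed)
    u: List[float] = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # d and V each mix U with their own independent noise term.
    d = [q * ui + math.sqrt(1.0 - q * q) * rng.gauss(0.0, 1.0) for ui in u]
    v = [r * ui + math.sqrt(1.0 - r * r) * rng.gauss(0.0, 1.0) for ui in u]
    return pearson(d, v)
```

Calling `simulate_detector_correlation(0.6, -0.5, 50000)` should return a value near the product $qr = -0.3$, which shows how an imperfect detector attenuates the measured relationship between $U$ and $V$.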